The knockoff filter for FDR control in group-sparse and multitask regression
Abstract
We propose the group knockoff filter, a method for false discovery rate control in a linear regression setting where the features are grouped, and we would like to select a set of relevant groups which have a nonzero effect on the response. By considering the set of true and false discoveries at the group level, this method gains power relative to sparse regression methods. We also apply our method to the multitask regression problem where multiple response variables share similar sparsity patterns across the set of possible features. Empirically, the group knockoff filter successfully controls false discoveries at the group level in both settings, with substantially more discoveries made by leveraging the group structure.
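To make "false discoveries at the group level" concrete, the following minimal sketch computes the group-level false discovery proportion for a given selection in a simulation where the true coefficients are known. It is not the group knockoff filter itself; the function name `group_fdp` and its arguments are hypothetical, introduced only for illustration.

```python
import numpy as np

def group_fdp(selected_groups, beta, groups):
    """Group-level false discovery proportion (illustrative helper, not from the paper).

    selected_groups: group labels chosen by some selection procedure
    beta: true coefficient vector (known only in simulation)
    groups: group label of each feature, same length as beta
    """
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)
    # A group counts as truly relevant if any of its coefficients is nonzero.
    true_groups = {
        g for g in set(groups.tolist()) if np.any(beta[groups == g] != 0)
    }
    selected = set(selected_groups)
    false_discoveries = len(selected - true_groups)
    # Convention: the proportion is 0 when nothing is selected.
    return false_discoveries / max(1, len(selected))

# Three groups of two features each; only group 1 carries signal.
beta = [0.0, 0.0, 1.5, 0.0, 0.0, 0.0]
groups = [0, 0, 1, 1, 2, 2]
print(group_fdp([1, 2], beta, groups))  # one of the two selected groups is null
```

The FDR is the expectation of this proportion over repeated draws of the data, so in a simulation it would be estimated by averaging `group_fdp` across many replications.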
Similar resources
A Pseudo Knockoff Filter for Correlated Features
In 2015, Barber and Candès introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method achieves exact FDR control. Inspired by the work of Barber and Candès (2015), we propose and analyze a pseudo knockoff filter that inherits some advantages of the original knockoff filter and has more flexibility in constructing ...
Some Analysis of the Knockoff Filter and its Variants
In many applications, we need to study a linear regression model that consists of a response variable and a large number of potential explanatory variables and determine which variables are truly associated with the response. In 2015, Barber and Candès introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method a...
Communication-Efficient False Discovery Rate Control via Knockoff Aggregation
The false discovery rate (FDR)—the expected fraction of spurious discoveries among all the discoveries—provides a popular statistical assessment of the reproducibility of scientific studies in various disciplines. In this work, we introduce a new method for controlling the FDR in meta-analysis of many decentralized linear models. Our method targets the scenario where many research groups—possib...
A knockoff filter for high-dimensional selective inference
This paper develops a framework for testing for associations in a possibly high-dimensional linear model where the number of features/variables may far exceed the number of observational units. In this framework, the observations are split into two groups, where the first group is used to screen for a set of potentially relevant variables, whereas the second is used for inference over this redu...
Sparse regression and marginal testing using cluster prototypes.
We propose a new approach for sparse regression and marginal testing, for data with correlated features. Our procedure first clusters the features, and then chooses as the cluster prototype the most informative feature in that cluster. Then we apply either sparse regression (lasso) or marginal significance testing to these prototypes. While this kind of strategy is not entirely new, a key featu...